Multi-neighbor proportional forwarding in a network

ABSTRACT

A source-address independent, multi-path routing method for use in a communication network is disclosed. A network node uses an information packet's destination address to determine a plurality of neighbors it has a proportional likelihood to forward that packet to, such that packets can be sent over multiple paths to a destination node without undesirable looping. Multiple-neighbor proportional forwarding routing tables are determined by computing a directed graph of multiple optimal paths to each destination node. Optimal paths and forwarding proportions are based on link capacities. The multi-path routing method provides maximized throughput, reduced congestion and superior load balancing over single-path routing. Source-address independence also overcomes a major problem of more traditional source-address dependent multi-path routing methods. The source-address independent, multi-path routing method can be implemented in mixed networks where some nodes do not implement the method. Data structures and methods for determining the optimal paths and parameters are provided.

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit under 35 U.S.C. §119(e) of provisional application Ser. No. 61/195,346, entitled “Stream-Basis Set Division Multiplexing”, filed Oct. 7, 2008. The subject matter of provisional application Ser. No. 61/195,346 is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present invention relates generally to the sending of data over a network, and more particularly to the optimizing of routing tables for network routing.

BACKGROUND

FIG. 1 (Prior Art) shows a strongly connected directed graph 100 representing a network of routers and allowed links between adjacent routers for an information packet directed to network router H 108. The directed graph 100 illustrates the use of Source-Independent Single-Path routing according to traditional routing schemes like minimum hops or shortest path. Oval shapes represent network routers, also referred to as hosts or “nodes”, that forward information packets from a source device, such as a home computer connected to a node, to a destination device, such as another home computer connected to a node. Shown are node A 101, node B 102, node C 103, node D 104, node E 105, node F 106, node G 107 and node H 108. Also shown are an example information packet source device 109 connected to node E 105 and an example information packet destination device 110 connected to node H 108.

Where an information packet originates from a given source device, it must always first pass through the node to which the source device is connected. This first node will be referred to here as the source router. Where the information packet is destined for a given destination device, the last node the information packet passes through is the node to which the destination device is connected. This last node will be referred to here as the destination router. The destination router can be determined from the packet's destination address using well known network topology protocols. Accordingly, where an information packet originates from source device 109 destined for destination device 110, the source node is node E 105, and the destination node is node H 108. An information packet typically includes destination data from which can be determined the destination device and destination node, as well as source data from which can be determined the source device and source node.

In the directed graph 100, darkened arrows indicate allowed links between nodes for forwarding an information packet to destination node H 108. For any given node which is to forward an information packet to destination node H 108, the information packet will always be forwarded to the same next adjacent node. The next adjacent node forwarded to is also referred to as a “next hop”.

Arrow 111 indicates the link by which node A 101 forwards the information packet to node B 102. Arrow 112 indicates the link by which node B 102 forwards the information packet to node F 106. Arrow 113 indicates the link by which node C 103 forwards the information packet to node G 107. Arrow 114 indicates the link by which node D 104 forwards the information packet to node E 105. Arrow 115 indicates the link by which node E 105 forwards the information packet to node F 106. Arrow 116 indicates the link by which node F 106 forwards the information packet to node H 108. Arrow 117 indicates the link by which node G 107 forwards the information packet to node H 108. Non-darkened arrows illustrate links which are not allowed when the destination node is node H 108. A directed graph of paths to a different destination node would have a different set of allowed and non-allowed links.

Thus, an information packet originating at source device 109 and destined for destination device 110 will first pass to node E 105. Node E 105 will always forward the information packet to next hop node F 106 via link 115. Node F 106 will always forward the information packet to next hop node H 108 via link 116. Node H 108 (the destination node) will then forward the information packet to destination device 110.

As can be seen from the darkened link arrows, there is always a single route from a given node to a given destination node when using this Single-Path routing method. Because the route is determined based on the destination of the information packet, and not determined based on the source of the information packet, the method is referred to as Source-Independent Single-Path routing.
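The single route from each node can be traced mechanically. The following minimal sketch transcribes the next hops of arrows 111 through 117 into a dictionary and follows them to destination node H; the names `NEXT_HOP_TO_H` and `single_path` are illustrative assumptions, not part of the disclosure.

```python
# Next hops toward destination H, transcribed from arrows 111-117 of FIG. 1.
NEXT_HOP_TO_H = {
    "A": "B", "B": "F", "C": "G", "D": "E",
    "E": "F", "F": "H", "G": "H",
}


def single_path(source, destination="H", next_hop=NEXT_HOP_TO_H):
    """Follow the one next hop stored at each node until the destination is reached."""
    path = [source]
    while path[-1] != destination:
        nxt = next_hop[path[-1]]
        if nxt in path:  # revisiting a node would mean a directed cycle
            raise ValueError(f"directed cycle detected at node {nxt}")
        path.append(nxt)
    return path


print(single_path("E"))  # ['E', 'F', 'H']
```

Note that the source device never appears in the lookup: the path depends only on the current node and the destination.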

FIG. 2 (Prior Art) is a simplified view of an example node E 105 implementing Source-Independent Single-Path routing. Node E 105 includes a processor 200 and a memory 201. Processor 200 and memory 201 communicate via buses 202 and 203. Memory 201 stores network topology information 204 according to a well-known protocol such as Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (ISIS). An algorithm 205 running on the processor 200 uses the network topology information 204 to determine which adjacent node should be the next hop an information packet takes from node E 105 given the destination node of the information packet. These destination-dependent next hops are stored in memory 201 as the routing table 206 for node E 105.

The algorithm 205 that builds the routing table uses a traditional routing scheme such as minimum hops or shortest path. These traditional routing schemes rely on forwarding a given information packet to a single neighboring router based on its destination for several reasons.

First, the cost of each link between neighboring routers is constant. This link cost is also known as a link metric. Link costs can be based on a link delay or a link transmission time. In the case of hop counting, every link cost is set to be equal. When the link costs change for any reason, so do the routing tables that determine forwarding neighbors. In the algorithm 205 that builds the routing table 206, the cost of a path between the forwarding node 105 and a destination node is determined by adding the costs of the intervening path links. And, the dynamic programming principle of optimality applies, meaning that all sub-paths of a path between the forwarding node 105 and a destination router must be optimal. Determining the cost of a path by adding the costs of the path links is known as the Link Metric Model.
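As a hedged illustration of the Link Metric Model, the sketch below builds a single-path routing table with Dijkstra's shortest-path algorithm, in which a path's cost is the sum of its link costs. The function name `next_hops` and the link costs in `LINKS` are hypothetical; the text does not give actual link costs for FIG. 1.

```python
import heapq


def next_hops(links, source):
    """Return {destination: (path_cost, first_hop)} for one forwarding node.

    links maps node -> {neighbor: link_cost}. Path cost is the sum of link
    costs along the path (Link Metric Model).
    """
    best = {source: (0, None)}
    heap = [(0, source, None)]
    while heap:
        cost, node, first = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry; a cheaper path was already found
        for nbr, link_cost in links.get(node, {}).items():
            new_cost = cost + link_cost
            hop = nbr if first is None else first  # first hop out of the source
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, nbr, hop))
    del best[source]
    return best


# Hypothetical link costs around node E (not taken from the figures).
LINKS = {
    "E": {"F": 1, "D": 1, "G": 2},
    "F": {"H": 1, "E": 1},
    "G": {"H": 1},
    "D": {"E": 1},
    "H": {},
}
table_e = next_hops(LINKS, "E")
```

With these assumed costs the table sends traffic for both H and F to next hop F, the same shape as rows 300 and 301 of routing table 206.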

Additionally, it is simple to avoid creating undesirable looping paths or “directed cycles” using Single-Path routing methods. FIG. 3 (Prior Art) is an expanded view of simplified routing table 206 for node E 105. Row 300 indicates that the next hop when an information packet is destined for destination router H 108 is node F 106. Row 301 indicates that the next hop node when an information packet is destined for destination router F 106 is node F 106.

FIG. 4 (Prior Art) is a simplified view of an example routing table 400 for node F 106, shown for comparison with the example routing table 206 for adjacent node E 105, illustrating the lack of looping, also known as a “directed cycle”, between node E 105 and node F 106. Row 401 indicates that the next hop node when an information packet is destined for destination router H 108 is node H 108. Row 402 indicates that the destination router has been reached when an information packet is destined for destination router F 106. By comparing routing table rows 300 and 301 with routing table rows 401 and 402, it is seen that for instances when the next hop node from node E 105 is node F 106, the next hop node from node F 106 is never node E 105.

In this way, routing tables for adjacent nodes are constructed such that information packets do not follow directed cycles in the Single-Path routing methods. And, because routing table next hops in Single-Path routing methods depend only on the destination of the information packet, and not the source, Single-Path routing methods can be referred to as Source-Independent Single-Path routing. However, problems inherent in Source-Independent Single-Path routing include over-utilizing, and therefore congesting, the links along optimal paths. Conversely, other links can remain largely underutilized.

In contrast, Multi-Path routing methods can utilize more available links and better avoid link congestion. Known Multi-Path routing methods calculate a set of “best” disjoint paths for an information packet from a particular source to a particular destination node and channel traffic between those two nodes along those paths, in inverse proportion to their costs. Disjoint paths do not share the same set of nodes or links. In a network, this has to be done for every possible pair of source and destination nodes. FIG. 5 (Prior Art) is a directed graph 500 of the network of routers of FIG. 1 (Prior Art) and links between adjacent routers illustrating the use of Source-Dependent Multi-Path routing.

Darkened arrows in the directed graph 500 of a network indicate allowed links between nodes for forwarding an information packet according to Multi-Path routing when the source node is node A 501 and the destination node is node H 508. A Multi-Path routing method would, for example, use not only the path from node A 501 to node B 502 via link 511, then from node B 502 to node F 506 via link 512, and then from node F 506 to node H 508 via link 514, but also the disjoint path from node A 501 to node C 503 via link 517, then from node C 503 to node G 507 via link 513, and then from node G 507 to node H 508 via link 515 to transport packets from source node A to destination node H. Non-darkened arrows illustrate links which are not allowed when the source node is node A 501 and the destination node is node H 508.

Because routing in known methods of Multi-Path routing requires consideration of the source of the information packet in order to prevent paths with directed cycles, known Multi-Path routing methods can be referred to as Source-Dependent Multi-Path routing. As can be seen from the darkened link arrows, additional paths to the destination router are available when using Source-Dependent Multi-Path routing rather than Source-Independent Single-Path routing. Directed cycles are prevented because routing tables under Source-Dependent Multi-Path routing methods refer to both the source and the destination of an information packet to determine the next hop.

However, in a network with thousands of nodes, Source-Dependent Multi-Path routing presents at least two obstacles. Calculating a set of best disjoint paths from every possible source node to every possible destination node can be computationally intractable. Further, the memory requirements for storing a Source-Dependent Multi-Path routing table can be impractical.

FIG. 6 (Prior Art) shows a portion of a simplified Source-Dependent Multi-Path routing table 600 for node E 505 of FIG. 5 (Prior Art). The source-dependent entries cause the routing table 600 to balloon in size, even in our small example network of only eight routers. In a real network of thousands of routers and hundreds of thousands of destination addresses, the Source-Dependent Multi-Path routing table can approach terabyte size.

SUMMARY

A source-address independent routing method called Multi-Neighbor Proportional Forwarding (MNPF) is disclosed. A proportional forwarding routing table is present on the data path of the network node. In one aspect, the MNPF capable node uses an information packet's destination address to determine a plurality of neighbors it has proportional likelihoods to forward that packet to. One of these neighbors is then selected, independent of the source of that packet, and the packet is forwarded to the selected neighbor. The neighbors are selected in some pre-determined proportions to maximize throughput. The routing method embeds minimal congestion and maximal load distribution in its requirement criteria. It includes procedures for determining the optimal proportions by which a neighbor is selected. It also includes procedures for determining the optimal multiple paths in the network along which packets can be routed without undesirable loops developing. The method can be used in typical Internet Protocol (IP) suite networks, or with other network protocols such as Asynchronous Transfer Mode (ATM).

The implementation of the MNPF routing method consists of two phases: a setup phase and a run phase. In one aspect, the setup phase gathers information about the network topology and the costs associated with the links in the network. In one aspect, costs are determined by reference to the bandwidth or throughput of network links. In another aspect, costs can be determined by reference to link delays or link transmission times. Setup then outputs information in terms of optimal paths and parameters that are necessary to set up routing/forwarding tables. The optimal path and parameter information is described here in terms of optimal Forwarding Directed Acyclic Graphs (FDAGs) rooted at each destination node. In one aspect, the optimal FDAGs are changed when the network topology changes, when the capacities of forwarding links change, or when other information as to network conditions changes.

The setup phase exists in the control plane of a router. The run phase is the use of said routing tables to route data packets. It is important to emphasize that the MNPF method is a method of determining routing tables, and not a protocol.

In one aspect, an engine running on each network router determines the FDAGs for forwarding a packet to each destination node from any other network node, and then sets up the appropriate proportional forwarding routing table for the particular router. In another aspect, MNPF nodes are in a mixed network with routers that do not implement Multi-Neighbor Proportional Forwarding. In the latter case, determination of optimal FDAGs and MNPF routing tables takes into account that some nodes do not implement the MNPF method. In yet another aspect, a subset of network nodes or computing devices other than routers can determine the optimal FDAGs and then assign MNPF routing tables to MNPF-capable nodes.

In one aspect, MNPF routing is implemented using a variant of the MNPF routing table called the Random Neighbor (RN) routing table. The RN routing table implements MNPF forwarding proportions using thresholds and aliases, such that random selection of a next hop node can be performed in the run time of the network.
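For orientation, the general idea of thresholds and aliases can be sketched with the standard alias method for sampling from a discrete distribution (Walker's method in Vose's formulation). This is an assumption about the flavor of the technique, not the specific setup procedure of FIG. 24; the names `build_alias_table` and `pick_next_hop` are hypothetical, and the 0.60/0.40 proportions are those shown for node E in FIG. 8.

```python
import random


def build_alias_table(proportions):
    """Setup: turn {neighbor: probability} into (neighbor, cutoff, alias) records."""
    names = list(proportions)
    n = len(names)
    scaled = [proportions[name] * n for name in names]  # mean of scaled values is 1
    cutoff = [1.0] * n
    alias = list(range(n))  # by default a record aliases itself
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        cutoff[s], alias[s] = scaled[s], l  # record s is topped up by neighbor l
        scaled[l] -= 1.0 - scaled[s]        # l donated the missing share
        (small if scaled[l] < 1.0 else large).append(l)
    return [(names[i], cutoff[i], names[alias[i]]) for i in range(n)]


def pick_next_hop(table, rng=random):
    """Run phase: one uniform record pick plus one threshold comparison."""
    name, cutoff, alias = table[rng.randrange(len(table))]
    return name if rng.random() < cutoff else alias


rn_table = build_alias_table({"F": 0.60, "G": 0.40})
```

The selection cost does not grow with the number of neighbors, which is what makes random selection practical at forwarding time.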

In another aspect, MNPF forwarding proportions are approximated within the structure of a traditional format routing table using Stream Basis Set Division Multiplexing. In this variant of MNPF routing, forwarding proportions and packet order are maintained without random selection of next hop nodes and without changing the format of known routing tables.

The multi-path routing method provides maximized throughput, reduced congestion and superior load balancing over single-path routing. Source-address independence also overcomes a major problem of more traditional source-address dependent multi-path routing methods. Data structures and methods for determining the optimal paths and parameters are provided.

Other structures and methods are described in the detailed description below. This summary does not purport to define the invention. The invention is defined by the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, where like numerals indicate like components, illustrate embodiments of the invention.

FIG. 1 (Prior Art) shows a strongly connected directed graph representing a network of routers and allowed links between adjacent routers for Source-Independent Single-Path (SISP) routing.

FIG. 2 (Prior Art) is a simplified view of node E 105, expanded, of FIG. 1, implementing Source-Independent Single-Path routing.

FIG. 3 (Prior Art) is an expanded view of the simplified Source-Independent Single-Path routing table for node E 105 of FIG. 2.

FIG. 4 (Prior Art) is a simplified view of an example routing table 400 for node F 106, shown for comparison with the example routing table 206 for adjacent node E 105.

FIG. 5 (Prior Art) shows a strongly connected directed graph representing a network of routers and allowed links between adjacent routers, illustrating the use of Source-Dependent Multi-Path (SDMP) routing.

FIG. 6 (Prior Art) shows a portion of a simplified Source-Dependent Multi-Path routing table 600 for node E 505 of FIG. 5 (Prior Art).

FIG. 7A is a Forwarding Directed Acyclic Graph (FDAG) 700 rooted at destination node H 708, illustrating an aspect of Multi-Neighbor Proportional Forwarding (MNPF).

FIG. 7B is a table 722 corresponding to the FDAG 700 of FIG. 7A.

FIG. 8 shows an expanded view of node E 705 of FIG. 7, and adjacent nodes, with proportional likelihoods of forwarding an information packet, destined for node H 708, to one of a determined plurality of neighboring nodes.

FIG. 9 is a simplified view of node E 705 of FIG. 8 illustrating one embodiment.

FIG. 10 is an expanded view of a simplified proportional forwarding routing table 906 for node E 705 of FIG. 9.

For comparison with the routing table 906 of FIG. 10, FIG. 11 is a simplified view of an example proportional forwarding routing table 1100 for node F 706 of FIG. 8.

FIG. 12 illustrates an embodiment of a method for constructing the Optimal FDAG.

FIG. 13 is a diagram illustrating the use of the GetFlow function to find maximum traffic flow from a given node, such that the function is reduced to a Linear-Programming problem in a standard form.

FIG. 14 shows unequal MNPF forwarding probabilities for a node having neighbors A through F.

FIG. 15 illustrates line intervals of probabilities corresponding to the neighboring nodes of FIG. 14, covering a unit line interval.

FIG. 16 illustrates a typical decision tree 1600 for randomly determining the next hop based on the unequal MNPF proportions of FIG. 14.

FIG. 17 illustrates an optimal pre-computed decision tree 1700 for randomly determining the next hop based on the unequal MNPF proportions of FIG. 14.

FIG. 18 shows an example Random Neighbor (RN) routing table, which is a variant of the MNPF routing table allowing random selection of an allowed next hop with the correct probabilities in constant-time.

FIG. 19 is the i^(th) record R[i] in an array of records for a node having multiple adjacent next hop nodes illustrating an aspect of the “Alias” method.

FIG. 20 is a flowchart illustrating the run phase of a node implementing the Alias Method and forwarding a packet according to the node's stored Random Neighbor routing table.

FIG. 21 illustrates an example of the run phase of the Alias method, in which the next hop neighbor is randomly selected.

FIG. 22 shows an example array of six records with cutoff probabilities and alias neighbor ID's corresponding to forwarding probabilities of FIG. 14.

FIG. 23 shows a second possible example array of six records with cutoff probabilities and alias neighbor ID's also corresponding to forwarding probabilities of FIG. 14.

FIG. 24 shows the setup phase of the Alias method, in which the cutoff probabilities and alias ID's for an RN routing table are computed, based on an array of MNPF forwarding proportions.

FIG. 25 is a simplified view of a network node implementing the Alias method of MNPF packet routing.

FIG. 26 shows a portion of an FDAG and discrete ranges of addresses served by the destination node of the FDAG, illustrating an aspect of MNPF routing using Stream Basis Set Division Multiplexing (SBSDM).

FIG. 27 shows a portion of a simplified SBSDM routing table.

FIG. 28 shows an FDAG with multiple routes to the destination node, illustrating an aspect of SBSDM.

FIG. 29 shows a rooted tree that structurally approximates the FDAG shown in FIG. 28, illustrating the concept of packet streams used in the SBSDM method.

FIG. 30 shows a different rooted tree that structurally approximates the FDAG shown in FIG. 28, illustrating the concept of packet streams used in the SBSDM method.

FIG. 31 is a simplified view of a node in a network of nodes implementing SBSDM.

DETAILED DESCRIPTION

Reference will now be made in detail to some embodiments of the invention, examples of which are illustrated in the accompanying drawings.

FIG. 7A is a Forwarding Directed Acyclic Graph (FDAG) 700 of routers and links between adjacent routers in a network, illustrating an aspect of Multi-Neighbor Proportional Forwarding. The FDAG is defined as a subset of links for the network such that there are no directed cycles and such that all traffic can reach a destination node. In FIG. 7A, the destination node is node H 708. As with FIG. 1, oval shapes represent network nodes and arrows represent forwarding links between nodes. Shown are node A 701, node B 702, node C 703, node D 704, node E 705, node F 706, node G 707 and node H 708. The FDAG is rooted at destination node H 708. Darkened arrows denote the subset of allowed links in the FDAG 700 of a network such that there are no directed cycles and such that an information packet at any node can reach node H 708. The illustrated nodes are MNPF capable and forward the information packet to destination node H 708 via one of the more than one allowed next hop links using Multi-Neighbor Proportional Forwarding.

Node A 701 can forward the information packet to node B 702 via link 711, or to node C 703 via link 712. Node B 702 can forward the information packet to node D 704 via link 713, or to node F 706 via link 714. Node C 703 can forward the information packet to node D 704 via link 715, or to node G 707 via link 716. Node D 704 in the illustrated FDAG 700 rooted at destination node H 708 has only one allowed next hop to node E 705 via link 717.

Node E 705 can forward the information packet to node F 706 via link 718, or to node G 707 via link 719. Node F 706 has only one allowed next hop to destination node H 708 via link 720, and Node G 707 has only one allowed next hop to destination node H 708 via link 721. By following the darkened arrows indicating allowed links, it is seen that a packet is never returned to a node from which it has been forwarded, thus avoiding directed cycles.
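The two defining properties of an FDAG (no directed cycles, and every node can reach the destination) can be checked directly on the allowed links 711 through 721 listed above. The sketch below is illustrative only; `FDAG_TO_H` and `is_valid_fdag` are hypothetical names, and the check assumes every node appears as a key of the dictionary.

```python
# Allowed next hops toward destination H, transcribed from links 711-721 of FDAG 700.
FDAG_TO_H = {
    "A": ["B", "C"], "B": ["D", "F"], "C": ["D", "G"], "D": ["E"],
    "E": ["F", "G"], "F": ["H"], "G": ["H"], "H": [],
}


def is_valid_fdag(allowed, destination):
    """True if the allowed links contain no directed cycle and no dead end.

    In a finite acyclic graph where every node other than the destination has
    at least one allowed next hop, every walk must end at the destination.
    """
    VISITING, DONE = 1, 2
    state = {}

    def visit(node):
        if state.get(node) == DONE:
            return True
        if state.get(node) == VISITING:
            return False  # back edge: a directed cycle exists
        state[node] = VISITING
        if node != destination and not allowed[node]:
            return False  # dead end that is not the destination
        ok = all(visit(nxt) for nxt in allowed[node])
        state[node] = DONE
        return ok

    return all(visit(node) for node in allowed)


print(is_valid_fdag(FDAG_TO_H, "H"))  # True
```

Adding a link from E back to D, for example, would create the cycle D to E to D and the check would fail.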

Non-darkened arrows illustrate links which are not allowed in the FDAG 700 rooted at destination node H 708. In this way, the illustrated FDAG 700 spreads network traffic over more than one path to a given destination router, reducing link congestion in the network.

FIG. 7B is a table 722 corresponding to the FDAG 700 of FIG. 7A. The table 722 contains the proportions for forwarding an information packet to destination node H 708 from any other node in the network. Column “From Node” 723 to column “To Node” 724 indicates the direction of the link and column “Proportion” 725 contains the fraction of traffic originating at a given node that is to be forwarded along that link. Note that the proportions going out of any “From Node” add up to 1.0.
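The normalization property of table 722 is easy to verify programmatically. In the sketch below, the rows for nodes D and E use the proportions stated for FIG. 8, and the rows for nodes F and G follow from each having a single allowed next hop; the rows for nodes A, B and C are omitted because their proportions are not stated in the text. The names `ROWS`, `outgoing_totals` and `proportions_are_normalized` are assumptions for illustration.

```python
import math
from collections import defaultdict

# (from_node, to_node, proportion) rows in the layout of columns 723, 724 and 725.
ROWS = [
    ("D", "E", 1.00),
    ("E", "F", 0.60),
    ("E", "G", 0.40),
    ("F", "H", 1.00),
    ("G", "H", 1.00),
]


def outgoing_totals(rows):
    """Sum the forwarding proportions leaving each 'From Node'."""
    totals = defaultdict(float)
    for from_node, _to_node, proportion in rows:
        totals[from_node] += proportion
    return dict(totals)


def proportions_are_normalized(rows, tol=1e-9):
    """True if the proportions out of every 'From Node' add up to 1.0."""
    return all(math.isclose(total, 1.0, abs_tol=tol)
               for total in outgoing_totals(rows).values())
```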

FIG. 8 shows an expanded view 800 of node E 705 of FIG. 7, adjacent routers, and links between node E 705 and adjacent routers. Also shown are the proportional likelihoods of each link being used to forward an information packet to one of multiple neighboring nodes in the case where the information packet is destined for destination node H 708. Node D 704 will always forward the information packet to next hop node E 705, thus the indicated proportional likelihood 801 of the information packet at node D 704 being forwarded via link 717 to adjacent node E 705 is 1.00 out of 1.00, or one hundred percent. The indicated proportional likelihood 802 of the information packet at node E 705 being forwarded via link 803 to neighboring node D 704 is 0.00 out of 1.00, or zero percent. Because node E 705 never forwards an information packet destined for destination router H 708 to node D 704, the information packet never follows an undesirable directed cycle between these two nodes.

The indicated proportional likelihood 804 of the information packet at node E 705 and destined for destination router H 708 being forwarded via link 718 to adjacent node F 706 is 0.60 out of 1.00, or sixty percent. The indicated proportional likelihood 805 of the information packet at node F 706 being forwarded via link 806 to adjacent node E 705 is 0.00 out of 1.00, or zero percent. Again, because node F 706 never forwards an information packet destined for destination router H 708 to node E 705, the information packet never follows an undesirable directed cycle.

Similarly, the indicated proportional likelihood 807 of the information packet at node E 705 and destined for destination router H 708 being forwarded via link 719 to neighboring node G 707 is 0.40 out of 1.00, or forty percent. The indicated proportional likelihood 808 of the information packet at node G 707 being forwarded via link 809 to adjacent node E 705 is 0.00 out of 1.00, or zero percent. Because node G 707 never forwards an information packet destined for destination router H 708 to node E 705, the information packet never follows an undesirable directed cycle between these two routers. However, because node E 705 has a greater-than-zero proportional likelihood of forwarding the information packet to either of node F 706 and node G 707, node E 705 spreads network traffic over more than one path to destination node H 708.
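The forwarding decision at node E can be illustrated with a weighted random choice over the allowed next hops. This is a minimal sketch using Python's standard `random.choices`; the table layout `MNPF_TABLE_E` and the function `choose_next_hop` are hypothetical, and only the 0.60 and 0.40 proportions come from FIG. 8.

```python
import random

# Destination -> {allowed next hop: proportional likelihood} for node E.
MNPF_TABLE_E = {"H": {"F": 0.60, "G": 0.40}}


def choose_next_hop(table, destination, rng=random):
    """Pick one allowed next hop with probability equal to its proportion.

    Only the destination is consulted; the packet's source plays no role.
    """
    entry = table[destination]
    neighbors = list(entry)
    weights = [entry[n] for n in neighbors]
    return rng.choices(neighbors, weights=weights, k=1)[0]


# Over many packets, about sixty percent should go to F and forty percent to G.
rng = random.Random(0)
picks = [choose_next_hop(MNPF_TABLE_E, "H", rng) for _ in range(10_000)]
share_f = picks.count("F") / len(picks)
```

Node D never appears as a candidate because its proportion is zero, which is how the table itself rules out the directed cycle between D and E.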

FIG. 9 is a simplified view of node E 705 of FIG. 8 illustrating one embodiment. Node E 705 includes a processor 900 and a memory 901. Processor 900 and memory 901 communicate via buses 902 and 903. Memory 901 stores network topology information 904 according to a well-known protocol such as OSPF or ISIS. An FDAG set-up engine 905 running on the processor 900 uses the network topology information 904 for computing Optimal FDAGs for each possible destination node of the network. The FDAG set-up engine 905 uses the computed FDAGs and proportions to determine the appropriate entries for each destination node in an MNPF routing table 906 for node 705. The proportional forwarding table 906 for the node 705 is stored in memory 901 for later use in forwarding of an information packet to its destination. Other MNPF capable nodes in the network each run their own instance of the FDAG set-up engine 905 to determine their own proportional forwarding tables. Note that FDAG engines can be run on a subset of MNPF capable routers, in which case these routers determine routing tables for other MNPF capable routers.

FIG. 10 is an expanded view of a simplified proportional forwarding routing table 906 for node E 705 of FIG. 9. Because the specific method of generating the table described here is called Multi-Neighbor Proportional Forwarding (MNPF), the routing table can be referred to as an MNPF routing table. However, “proportional forwarding table” or “MNPF routing table” can refer to any routing table data for proportional forwarding of information packets to adjacent nodes without reference to the source of the information packets.

Three columns are shown: “Destination Node” 1001, “Number of Allowed Next Hops” 1002, and “Forward To, Proportion” 1003. Since the nodes in the example directed graph of FIG. 5 have either two or three neighbors, the Forward To, Proportion column 1003 shows space for up to three next hops with proportional likelihoods for forwarding to each allowed next hop. Row 1004 and row 1005 of the routing table 906 indicate destination routers for which node E 705 has a greater-than-zero likelihood of forwarding an information packet to node F 706. Proportional likelihoods 804 and 807 from FIG. 8 are entered in the routing table 906 row 1005 showing that, for destination router H 708, node E 705 has greater-than-zero proportional likelihoods of forwarding an information packet to one of node F 706 and node G 707.

For comparison with the routing table 906 of FIG. 10, FIG. 11 is a simplified view of an example Multi-Neighbor Proportional Forwarding routing table 1100 for node F 706 of FIG. 8. Row 1101 and row 1102 of the routing table 1100 show that in cases where node E 705 has a greater-than-zero likelihood of forwarding an information packet to node F 706, node F 706 has zero likelihood of forwarding the information packet to node E 705.

FIGS. 10 and 11 are simplified example routing tables for Multi-Neighbor Proportional Forwarding. Actual routing tables involve more information and may take different formats.

A Method for Constructing the Optimal FDAG

FIG. 12 shows an embodiment of the Multi-Neighbor Proportional Forwarding method for constructing the Optimal FDAG. During the setup phase, the MNPF approach constructs Optimal FDAGs for every possible destination node in the network, and then assigns proportional likelihoods of forwarding to allowed next hops for the MNPF routing tables. The method assumes that (i) a network is given with capacities specified for every link; (ii) all nodes are capable of executing the MNPF method; (iii) a destination node t∈N is specified. Optimal packet flow computation and determination of proportional forwarding likelihoods for nodes of an FDAG starts at the destination node and then moves backwards toward the source nodes. The flow computation and addition of the next source node to the FDAG, moving backwards from the destination node, is dependent on the graph computed up to this point and the forwarding proportions computed up to this point.

Let the network be represented by the directed graph G=(N,A), where N is the set of nodes and A⊆N×N is the set of links with capacities associated with each of them. Node t is the destination node. Given this input, the Optimal FDAG finding method 1200 is shown in FIG. 12.

The method in module 1201 automates the following procedure: Given a network topology and a destination node, it maximizes the rate of packets flowing to that destination (see FIG. 12) for an FDAG rooted at the destination node. The method not only provides the structure of the Optimal FDAG but also computes all the optimal forwarding proportions to the destination node.

The steps of the method are given in 1201. The method works on node lists P (for permanent) and T (for temporary) of the resulting FDAG F_(t). At every stage of the method the invariant P∪T=N is maintained. Initially P is empty and T=N. Then, at each step of the method, a node having the greatest flow (see FIG. 12) is moved from list T to list P until T is empty.

In particular note that the array ρ in 1201 gives the reverse topological numbers of the nodes in P. In order to do this for a node, 1201 records the step number (provided by the counting variable step) at which that node has been added to P.

Step 9) of 1201 is the greedy step of the method where a node u of maximum flow φ_(t)(u) is selected from the list T. Step 10) of 1201 deletes node u from T. Immediately in step 11) of 1201, the reverse topological index of u is set in the global array ρ. Step 13) inserts u into P. At this step, the forwarding neighbors of u, the forwarding proportions, and the maximum flow φ_(t)(u) from u are recorded.

Once u is inserted into P, step 14) of 1201 finds the maximum flow values of all the remaining nodes in T that point to u, by calling the GetFlow procedure 1202 (see FIG. 12) on them in step 15). This is necessary so that at the next iteration of the Main Loop, step 9) of 1201 can be executed correctly.
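The overall shape of this greedy loop can be sketched as follows. The listing of module 1201 appears only in FIG. 12, so the function names, the data layout, the infinite flow assigned to the destination, and the example capacities are all assumptions. In particular, `simple_get_flow` is a deliberately crude stand-in for GetFlow 1202: it does not solve the linear program and ignores capacity shared further downstream.

```python
def simple_get_flow(v, permanent, flow, capacity):
    """Crude stand-in for GetFlow 1202 (NOT the linear program of the method).

    Assumes link v->u can carry min(link capacity, flow already found for u).
    Returns (flow value for v, {neighbor: forwarding proportion}).
    """
    sent = {u: min(capacity[(v, u)], flow[u])
            for u in permanent if (v, u) in capacity}
    total = sum(sent.values())
    proportions = {u: f / total for u, f in sent.items() if total > 0}
    return total, proportions


def build_fdag(nodes, capacity, t, get_flow=simple_get_flow):
    permanent = []                      # permanent list, in insertion order
    temporary = set(nodes)              # temporary list; both lists together cover N
    flow = {v: 0.0 for v in nodes}      # best known flow value for each node
    flow[t] = float("inf")              # assumption: the destination absorbs any flow
    forwarding = {v: {} for v in nodes}
    rho = {}                            # reverse topological number of each node
    step = 0
    while temporary:
        u = max(sorted(temporary), key=lambda v: flow[v])  # greedy step (step 9)
        temporary.remove(u)                                # step 10
        rho[u] = step                                      # step 11
        step += 1
        permanent.append(u)                                # step 13
        for v in sorted(temporary):                        # steps 14 and 15
            if (v, u) in capacity:                         # v points to u
                flow[v], forwarding[v] = get_flow(v, permanent, flow, capacity)
    return permanent, rho, forwarding


# Hypothetical four-node network with destination "t".
CAPACITY = {("a", "t"): 3.0, ("b", "t"): 2.0, ("s", "a"): 2.0,
            ("s", "b"): 2.0, ("a", "b"): 1.0, ("b", "a"): 1.0}
order, rho, forwarding = build_fdag(["s", "a", "b", "t"], CAPACITY, "t")
```

Because a node only ever forwards to nodes that were already permanent when its flow was computed, every forwarding link points to a node with a smaller reverse topological number, so the result cannot contain a directed cycle.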

Tie-Breaking for Constructing the Optimal FDAG

One of the problems in step 9) of 1201 is to break a tie between contending nodes that may all have the same value for flow φ_(t). To achieve this, the following heuristic is adopted:

Every node has a unique label and these labels come from an ordered set. Furthermore, a record is kept for how many times a particular node has been considered for inclusion in the permanent set 𝒫. Given a contention between two nodes having the same maximum flow, the node that has been considered more often is declared the winner in the tiebreaker. If both of them have been considered the same number of times then their unique labels are used to break the tie.

The GetFlow Method

In this section we describe the GetFlow procedure 1202. Given a partial FDAG, this procedure determines the next candidate node in 𝒯 for inclusion in the permanent set 𝒫.

At the heart of the GetFlow procedure 1202 is the n×a Unit-flow Matrix M, where n is the number of nodes and a is the number of links in the directed graph G=(N,A). At any stage of the method in module 1201, M keeps a record of the amount of flow that would go through each link if a unit flow were sent from a node u∈𝒫. Thus the entry M[i, j] gives the amount of flow through link j if a unit flow were sent from node i∈𝒫.

Consider the general case 1300 depicted in FIG. 13, where v is a node in set 𝒯 and it has m forwarding neighbors u₁, . . . , u_(m) all belonging to set 𝒫. This condition corresponds to step 15) in module 1201 of FIG. 12, at GetFlow 1202. Then, to find the maximum flow from node v, one needs to maximize the flow f=f₁+ . . . +f_(m), where f_(i) is the flow sent to node u_(i) for 1≤i≤m. Note that the non-negativity constraint f_(i)≥0 applies for 1≤i≤m.

This flow maximization is also subject to the capacity constraints of each link. Therefore, if c₁, . . . , c_(m) are the capacities of links v→u₁, . . . , v→u_(m), the constraint is f_(i)≤c_(i) for 1≤i≤m. And,

$\sum\limits_{i=1}^{m} M[u_{i},l]\,f_{i} \leq c_{l}$ for all links l with capacity c_(l), both of whose endpoints belong to 𝒫. This problem is therefore reduced to a Linear-Programming problem in a standard form.
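Collecting the objective and the constraints stated above gives the following linear program. It is written out only as a restatement of the text; the symbols are those already defined (f_(i), c_(i), c_(l), M, u_(i)), and no additional constraint is assumed.

```latex
\begin{aligned}
\text{maximize}\quad   & f = \sum_{i=1}^{m} f_{i} \\
\text{subject to}\quad & 0 \le f_{i} \le c_{i}, && 1 \le i \le m, \\
                       & \sum_{i=1}^{m} M[u_{i},l]\, f_{i} \le c_{l}, && \text{for every link } l \text{ with both endpoints in the permanent set.}
\end{aligned}
```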

A variety of methods are known that solve the general Linear-Programming problem. However, the context of this particular problem makes the application of a general Linear-Programming Solver excessive. In particular, the fact that the problem is on a rooted directed acyclic graph helps in achieving a lower time complexity solution to this problem.

Let φ be the maximum flow found from node v, and φ_(i) represent the corresponding flow that it sends to forwarding neighbor u_(i) (1≤i≤m). In the final bookkeeping step of method GetFlow 1202, the v^(th) row of matrix M is updated as follows:

-   M[v,v→u_(i)]=φ_(i)/φ for 1≤i≤m

-   $M[v,l] = \frac{1}{\phi}\sum\limits_{i=1}^{m} M[u_{i},l]\,\varphi_{i}$ for all arcs l both of whose endpoints belong to 𝒫.

Note that if v is chosen for inclusion into 𝒫 in step 9) of module 1201, then the v^(th) row of matrix M cannot change any more. Otherwise, there is always a possibility that it could change if v is considered once again in step 15) of 1201. Also note that if v is chosen in step 9), the contents of M[v,v→u_(i)] for 1≤i≤m are the optimal forwarding proportions from v to its forwarding neighbors u₁, . . . , u_(m).
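The bookkeeping update of the v-th row of M can be illustrated with a short sketch. The representation is an assumption made for the example: M is held as a dictionary of sparse rows keyed by node, links are (tail, head) tuples, and the function name `update_unit_flow_row` is hypothetical. The per-neighbor flows φ_(i) are taken as given, since computing them is the job of the flow maximization.

```python
def update_unit_flow_row(M, v, neighbor_flows, permanent_links):
    """Rewrite row v of the unit-flow matrix from the flows sent to each neighbor.

    M               : dict node -> dict link -> flow on that link per unit flow sent
    v               : the node whose row is updated
    neighbor_flows  : dict u_i -> phi_i, flow sent from v to forwarding neighbor u_i
    permanent_links : links (tail, head) with both endpoints in the permanent set
    Returns phi, the total flow from v.
    """
    phi = sum(neighbor_flows.values())
    if phi == 0:
        M[v] = {}                  # no flow can be sent, so the row stays empty
        return phi
    row = {}
    # Direct links v -> u_i carry the forwarding proportion phi_i / phi.
    for u, phi_i in neighbor_flows.items():
        row[(v, u)] = phi_i / phi
    # Downstream links carry the proportion-weighted unit flows of the neighbors.
    for link in permanent_links:
        row[link] = sum(M[u].get(link, 0.0) * phi_i
                        for u, phi_i in neighbor_flows.items()) / phi
    M[v] = row
    return phi
```

For instance, if v sends 3 units to u1 and 1 unit to u2, and each neighbor forwards everything straight to t, the new row holds 0.75 on v→u1 and u1→t, and 0.25 on v→u2 and u2→t.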

A Variant of the Method for Mixed Networks

For the explanation of FIG. 12 (see above) we assumed that all nodes in the network were capable of executing the MNPF method. In real life, that will not always be the case. MNPF capable nodes will be intermingled with nodes that follow a Single-Path Routing protocol. We therefore now study a variant of the above method for this case. The differences in the assumptions are: (i) not all nodes are capable of executing the MNPF method; (ii) these non-MNPF capable nodes follow a Single-Path Routing protocol; (iii) all MNPF capable nodes are aware of the capabilities of all the other nodes in the network; and, (iv) all non-MNPF capable nodes assume that all nodes are similar to themselves. Given that all Single-Path Routings establish a directed tree Ψ_(t) rooted at the destination node t, the method simply allows MNPF capable nodes to grow their multiple paths starting from this skeletal tree. While this does not lead to optimality, this variant method is important for interoperability in a mixed node network.

The main differences between this method and the Optimal FDAG method of FIG. 12 are: (1) instead of developing its own topological ordering through the array ρ, this array is given to the method as input by the topological ordering of the tree Ψ_(t); (2) since the array ρ defines the order in which nodes are added in the tree Ψ_(t), this method uses the same order for addition to its permanent set 𝒫, instead of determining a node with the maximum flow as in step 9) of module 1201; and, (3) every non-destination node in the directed tree Ψ_(t) has exactly one forwarding neighbor. An MNPF capable node remains at zero flow if its forwarding neighbor in Ψ_(t) has not yet been added to the permanent set 𝒫.

The Physical Model for Calculating the FDAG

The Link Metric Model for calculating the costs of links is discussed above in regard to FIG. 1. While the Link Metric Model can be used in an MNPF setting, it is found that, in high-traffic situations, the Link Bandwidth Model for physically modeling the network in order to compute Optimal FDAGs gives better results in terms of traffic flow. In this model, the bandwidth of every link in the network is specified. In practice, these bandwidths are known for every link. Furthermore, by information exchange, every node in an autonomous routing area has knowledge of the bandwidths of all the links in that area.

Bandwidth is a term synonymous with capacity in communication networks, and is normally given in units of bits-per-second. So long as the packet flow rate on every link is less than its capacity, the network operates smoothly at “line speed”. A link is incapable of transmitting packets at a rate greater than the bandwidth. Indeed, if the demand for a link exceeds the operating bandwidth, packets get dropped. With some protocols, e.g. Transmission Control Protocol, the point-to-point throughput decreases because of packet drops.

For any given destination node, the aim of the MNPF forwarding method is to maximize the flow of all traffic destined for that node. However, there are two facts to consider in such a maximization procedure: (1) a non-destination node is unaware of the traffic generated in any other part of the network; and, (2) a non-destination node has to maximize its flow subject to the restrictions imposed by the forwarding proportions of the other non-destination nodes.

Given these two facts, the “best effort” scenario for flow maximization for a non-destination node is this: (1) it assumes that it is the only node transmitting packets to the destination node; (2) no other packet stream has consumed any portion of the bandwidths of the links that it uses to transmit these packets; and, (3) the intermediate nodes that it uses for transmitting these packets use proportional forwarding derived from these same considerations. As explained in this section, not all the intermediate nodes have to use proportional forwarding. However, packet flow results are sub-optimal in this case.

A Novel and Efficient Routing Table for Non-Uniform Random Neighbor Selection

A proportional forwarding table on the data path of a non-destination network node predetermines a policy by which the node is to forward a proportion of packets destined for a given destination node to a given neighboring node. One example of such a table is given by MNPF. In one embodiment, such a policy is achieved by forwarding packets on a per-packet basis, based upon a random outcome.

If proportional likelihoods in a node's MNPF routing table are relatively uniform for a given destination node entry, a typical method for random selection of a next hop node can be performed within reasonable time and memory requirements. For example, the first operation in a typical decision tree approach is to generate a random number u from a uniform distribution with range [0,1]. If the random number u is compared with uniform proportional likelihoods considered as intervals along a unit line, the next hop decision is made in relatively few decision tree steps.

However, optimal proportions for MNPF routing are rarely uniform. Optimal proportions for forwarding packets via multiple paths to the specified destination node are generally unequal. Consider a node having the six forwarding neighbors A through F with the forwarding probabilities as shown in FIG. 14.

FIG. 15 illustrates line intervals of probabilities corresponding to these neighbors covering a unit line interval. Then given a random number u, the interval it falls in is decided by a sequence of comparisons. If these comparisons are done left-to-right sequentially in the order shown in FIG. 15, they result in the decision tree 1600 shown in FIG. 16.

Since the random numbers are generated from a uniform distribution, 43% of the time (over a large number of trials) they fall in the interval corresponding to neighboring node F. However in order to make the decision to select neighboring node F, this decision tree has to make five comparisons every time!

In light of the above discussion, a decision tree that minimizes the average number of comparisons at a given node should be pre-computed at every node. In other words, given: 1. A non-destination node x∈N with forwarding neighbor set H_(x)⊂N, and 2. p_(xy), the forwarding probabilities from node x to node y, ∀y∈H_(x), the optimal decision tree is the one that minimizes T_(x), the average number of comparisons at x, where

$T_{x} = \sum\limits_{y \in H_{x}} p_{xy}\,T_{xy}$ and T_(xy) denotes the number of comparisons to decide y.
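To make the quantity T_(x) concrete, the sketch below evaluates it for a left-to-right sequential decision tree of the kind described above, in which the k-th interval is decided after k comparisons and the last interval needs no comparison of its own. The probabilities in the usage lines are hypothetical and are not the values of FIG. 14; the function name is likewise an assumption.

```python
def sequential_average_comparisons(probs):
    """Average number of comparisons T_x for a left-to-right sequential scan.

    probs : forwarding probabilities in the order the intervals are tested.
    The k-th interval (1-based) is decided after k comparisons, except the last,
    which is decided by the same comparison that rules out the one before it.
    """
    m = len(probs)
    counts = [min(k + 1, m - 1) for k in range(m)]
    return sum(p * c for p, c in zip(probs, counts))

# Hypothetical example: the same three probabilities in two different orders.
likely_last = sequential_average_comparisons([0.25, 0.25, 0.5])
likely_first = sequential_average_comparisons([0.5, 0.25, 0.25])
```

Testing the most likely neighbor first lowers the average, which is the effect an optimal decision tree exploits.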

FIG. 17 illustrates an optimal pre-computed decision tree 1700 for randomly determining the next hop based on the unequal MNPF proportions of FIG. 14. For optimal decision tree 1700, the average number of “u>p” comparisons made is 2.26. In comparison, the average number of “u>p” comparisons made using the decision tree of FIG. 16 is 4.24. However, in spite of its elegance and its optimality, the optimal decision tree 1700 has drawbacks, including: 1) Too many “u>p” type comparisons in order to select one next hop neighbor; 2) The associated space (memory) required for storing the optimal decision tree; 3) The time required for setting up the optimal decision tree.

The “Alias” Method, described in the section that follows, has none of these drawbacks. It is therefore the method of choice for random next hop neighbor selection in Multi-Neighbor Proportional Forwarding. The Alias Method builds a variant of the MNPF routing tables for each MNPF capable node. The MNPF routing table variant will be referred to as the Random Neighbor (RN) routing table. Note that the example MNPF routing tables of FIG. 10 and FIG. 11 also allow for random selection of a next hop. What is described with respect to the RN routing table is a variant of the MNPF routing table that allows improved speed of random next hop selection at run time. The RN routing table is a type of MNPF routing table.

An example RN routing table 1800 for node E 705 shown in FIG. 18 has three fields per record: Destination Node 1801, also called the key; Number of Next Hops 1802 for this destination (first part of the rule); and the List of Forwarding Nodes 1803 with thresholds and aliases associated with each of them (second part of the rule). Thresholds may also be referred to as cutoff probabilities. Aliases may also be referred to as alias ID's. For details on the meaning of the terms threshold and alias, the reader is referred to the section that follows on the Alias Method. The RN routing table 1800 uses thresholds and aliases to implement the forwarding proportions shown in FIG. 10.

The “Alias” Method

The Alias method allows the random selection of the next hop neighbor with the correct probabilities in constant time in the worst case. There is, however, a linear setup time and extra linear storage. In fact, our method for the setup phase of this method is in-line (i.e. without any extra memory overhead). The “alias” method for generating discrete random variates is known in the art.

Let x∈N be any non-destination node with a set H_(x) of m forwarding neighbors. A way to implement this method is to have an array of m records R[0 . . . m−1] at node x, each record having a unique forwarding neighbor ID, the probability of forwarding to that neighbor, plus two other pieces of information: the cutoff probability and the alias neighbor ID. The i^(th) record R[i] in the array can thus be viewed as seen in FIG. 19 where x_(i)∈H_(x), the cutoff probability θ_(i)∈[0,1], and the alias neighbor ID a_(i)∈H_(x) is the ID of (usually another) forwarding neighbor.

The Alias Method: Run Phase

FIG. 20 is a flowchart illustrating the run phase of a node implementing the Alias Method and forwarding a packet according to the node's stored Random Neighbor routing table. In step 2000, a packet destined for a destination router t arrives at node x. In step 2001, the node x determines if the destination node t of the packet is itself. If yes, the packet has arrived at its destination node in step 2002 and the packet does not have to be forwarded according to the node's stored Random Neighbor routing table. If no, the packet is not at its destination node, and so in step 2003 the node generates a uniform random variate u within the set [0,1]. Next, in step 2004, the generated uniform random variate u is used to determine the next hop by reference to the Random Neighbor routing table entry for destination node t. Finally, in step 2005, the packet is forwarded to the determined next hop node y on its path to destination node t.

The simplicity and efficiency of the Alias method is best described by considering the run phase of the method shown in FIG. 21, in which the next hop neighbor is randomly selected. Assume that from the setup phase of this method, the array of records for node x has already been created. Then, given a uniform random variate u from [0,1], the method 2100 shown in FIG. 21 returns the next hop ID for node x.

Note that method 2100 is the (only) portion of step 2004 shown in FIG. 20. Also note that steps 1) and 2) of method 2100 create two independent random variates i and v from the single random variate u. In practice, from a numerical point of view, generating two random numbers may work better. The time complexity of this random selection method 2100 is thus O(1).
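The run phase can be sketched as below. FIG. 21 is not reproduced in this text, so the sketch follows the usual formulation of the alias method and should be read as an assumption about the details: the record layout `(neighbor_id, cutoff, alias_id)`, the function name, and the guard for u equal to 1 are all illustrative choices.

```python
def alias_next_hop(records, u):
    """Select a next hop in constant time from a pre-built record array.

    records : list of (neighbor_id, cutoff, alias_id) tuples, one per neighbor
    u       : uniform random variate in [0, 1]
    """
    m = len(records)
    scaled = u * m
    i = min(int(scaled), m - 1)    # record index, uniform over 0..m-1
    v = scaled - i                 # fractional part, reused as a second variate
    neighbor_id, cutoff, alias_id = records[i]
    # Keep the record's own neighbor with probability cutoff, else use its alias.
    return neighbor_id if v < cutoff else alias_id
```

A caller would typically draw u with `random.random()`. Whatever the number of neighbors, the selection costs one multiplication, one table access and one comparison.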

We show how this method works on the example of FIG. 14. FIG. 22 shows an array of six records corresponding to this example.

As a check, consider Pr(F), the probability of selecting neighbor ID F. F occurs as the alias in records 0, 1, and 2, and as the neighbor ID of record 5. Each of these records is chosen uniformly with probability 1/6. Therefore:

$\Pr(F) = \frac{1}{6}\left\{(1-\theta_{0}) + (1-\theta_{1}) + (1-\theta_{2}) + \theta_{5}\right\} = \frac{1}{6}\left\{0.70 + 0.58 + 0.40 + 0.90\right\} = \frac{2.58}{6} = 0.43$

This is exactly the forwarding probability for node F in the record array of FIG. 22.

Note that the record array is not unique by any means. FIG. 23 shows another possible record array for the same example. Again computing Pr(F), we get the forwarding probability of F:

$\Pr(F) = \frac{1}{6}\left\{(1-\theta_{0}) + (1-\theta_{1}) + (1-\theta_{2}) + (1-\theta_{3}) + \theta_{5}\right\} = \frac{1}{6}\left\{0.70 + 0.58 + 0.40 + 0.10 + 0.80\right\} = \frac{2.58}{6} = 0.43$
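The same check can be automated for every neighbor of a record array. The sketch below assumes the record layout `(neighbor_id, cutoff, alias_id)`, which is an illustrative choice; the small table passed to it is hypothetical, because the full contents of FIG. 22 and FIG. 23 are not reproduced here. Only the four terms quoted in the first check above are taken from the text.

```python
def selection_probabilities(records):
    """Exact probability of selecting each neighbor from an alias record array."""
    m = len(records)
    probs = {}
    for neighbor_id, cutoff, alias_id in records:
        # Each record is hit with probability 1/m; it then yields its own
        # neighbor with probability cutoff and its alias otherwise.
        probs[neighbor_id] = probs.get(neighbor_id, 0.0) + cutoff / m
        probs[alias_id] = probs.get(alias_id, 0.0) + (1.0 - cutoff) / m
    return probs

# Hypothetical two-record table: A is chosen 25% of the time, B 75%.
table = [("A", 0.5, "B"), ("B", 1.0, "B")]
probs = selection_probabilities(table)

# The four contributions to Pr(F) quoted in the text for FIG. 22.
pr_f = (0.70 + 0.58 + 0.40 + 0.90) / 6
```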

Alias Method: the Setup Phase

In this section, a linear-time in-line method is presented that computes the cutoff probabilities and alias ID's for an RN routing table, given an initial array of MNPF records that does not contain the cutoff probabilities and alias ID's. This setup phase of the “Alias” method 2400 is shown in FIG. 24. The “main” method is provided in procedure AliasSetup in module 2402.

In procedure AliasSetup 2402, the cutoff probabilities in the record array are initialized to m times the corresponding forwarding probability (i.e., θ_(i)←m×p_(i)). This operation will (in general) create cutoff probabilities that are greater than 1. Next, AliasSetup 2402 calls Pivot 2401. Pivot 2401 is a function similar to the partitioning step of the well-known Quicksort method: it partitions records (by exchanging them) with the Cutoff Probability as key, pivoting them around the pivot value of 1, such that all records with cutoff probabilities ≤1 occur to the left of those with cutoff probabilities >1. The Pivot 2401 function also returns the position of the first record with Cutoff Probability >1. If no such record exists, it returns a position beyond the bounds of the array by executing line number 10) of 2401.

Note that the function Pivot 2401 is a linear-time in-line method that performs the exchange-partition of records by known methods of moving a left pointer l to the right, and moving a right pointer r to the left.

The procedure AliasSetup 2402 is also linear-time because at step 10) of the procedure the pointer i is always incremented to the next record. Furthermore, the entire procedure is in-line without any extra storage requirements. The Alias ID and Cutoff Probability information updating steps of 8) and 9) respectively of 2402 are the most critical steps of the Alias method and constitute the reason this method works.
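For comparison, a setup phase can also be written with two work lists, as in the widely used textbook formulation of the alias method. The sketch below is that substitute technique, not the in-line Pivot-based procedure of FIG. 24: it uses extra linear storage for the work lists, and the function name and record layout `(neighbor_id, cutoff, alias_id)` are assumptions. It is linear-time and produces a valid record array; as noted above, the array is not unique.

```python
def alias_setup(neighbors, probs):
    """Build alias records (neighbor_id, cutoff, alias_id) in linear time.

    neighbors : list of forwarding neighbor IDs
    probs     : forwarding probabilities, same order, summing to 1
    """
    m = len(probs)
    scaled = [p * m for p in probs]        # theta_i <- m * p_i, may exceed 1
    cutoff = [1.0] * m
    alias = list(neighbors)                # default alias: the record's own ID
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        g = large.pop()
        cutoff[s] = scaled[s]              # record s keeps its own share
        alias[s] = neighbors[g]            # and borrows the rest from record g
        scaled[g] -= 1.0 - scaled[s]       # g gives away what s was missing
        (small if scaled[g] < 1.0 else large).append(g)
    # Any records left over keep cutoff 1.0 and alias themselves.
    return [(neighbors[i], cutoff[i], alias[i]) for i in range(m)]
```

Each record is finalized at most once, which is why the loop runs in time linear in the number of forwarding neighbors.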

FIG. 25 shows a node 2500 that implements the Alias method of MNPF packet routing. The node 2500 includes a processor 2501 and a memory 2502. Processor 2501 and memory 2502 communicate via buses 2503 and 2504. Memory 2502 stores network topology information 904 according to a well-known protocol such as OSPF or IS-IS. An FDAG set-up engine 905 running on the processor 2501 uses the network topology information 904 for computing Optimal FDAGs for each possible destination node of the network.

An engine implementing the Alias setup method 2400 runs on the processor 2501 and uses the forwarding proportions of the computed FDAGs to determine the appropriate Threshold and Alias rules to associate with allowed Forward To nodes for each destination node in an RN routing table 1800 for node 2500. Memory 2502 stores the RN routing table 1800. The Alias Run Phase engine 2100 running on processor 2501 uses the RN routing table 1800 to determine the next hop when an information packet is to be forwarded to another node.

Stream Basis Set Division Multiplexing

Stream Basis Set Division Multiplexing (SBSDM) describes a system and methods for implementing MNPF on existing routers with traditional routing tables, i.e., a routing table in which a single next hop is keyed to a destination address or range of addresses. Exploiting the fact that every node is or can be made to be responsible for a multiplicity of addresses, the present invention achieves close approximation to the optimal paths and proportions prescribed by the MNPF method on a traditional router. Since these addresses are already present in a traditional router's routing table, there is no increase in space. SBSDM consists of an algorithmic engine that generates the rules associated with each address in said table. SBSDM therefore works locally within individual traditional routers in the network. Its effect however is global.

The advantages of this scheme are:

A) SBSDM does not interfere with the data path of the router where fast table look-ups happen. B) While achieving the Multi-Path Routing objectives of congestion avoidance and load balancing, the simple structure of a traditional routing table is retained. Memory space and table look-up time are preserved. C) The concept of packet streams allows packet ordering to be maintained in the network.

Note that while SBSDM is an MNPF scheme, it does not involve random selection of next hop nodes. Rather, two packets with the same destination address always take the same path. However, two packets with different destination addresses served by the same destination node may take different paths. Thus, packets take multiple optimal paths to a destination node, and forwarding of packets by nodes along these paths approximates MNPF proportions.

FIG. 26 shows a portion of an FDAG similar to that of FIG. 7, with MNPF routing of packets from node E 2605 to node H 2608. Node H 2608 is responsible for forwarding those packets onward to a range of addresses via sub-net node 2609, sub-net node 2610, sub-net node 2611, sub-net node 2612, or sub-net node 2613. Node E 2605 forwards 60% of packets destined for node H 2608 to node F 2606 via link 2614, and 40% of packets destined for node H 2608 to node G 2607 via link 2615, according to MNPF routing as shown in the routing table entry 1005 for node H in FIG. 10.

In a router implementing SBSDM, an SBSDM algorithmic engine resides in the control path, isolated from the data path. The SBSDM engine takes the forwarding proportions dictated by MNPF and runs an optimization engine to provide the contents of a traditional routing table such that the MNPF forwarding proportions are approximated.

FIG. 27 shows a portion of a simplified traditional routing table 2700 for node E 2605 with the MNPF forwarding proportions of routing table entry 1005 of FIG. 10 approximated according to SBSDM. The Destination Range column 2701 contains entries for the five sub-nets served by node H 2608. Corresponding Next Hop column 2702 entries show that packets for three of the five ranges of destination addresses are forwarded to node H 2608 via node F 2606. Packets for the other two of the five ranges of destination addresses are forwarded to node H 2608 via node G 2607. The SBSDM algorithmic engine generates the rules associated with each address or range of addresses in the table. FIG. 27 is, therefore, a portion of an SBSDM routing table, which is a type of MNPF routing table.

Thus, assuming approximately equal traffic to each range of addresses, the forwarding proportion of 60% of packets forwarded to node H 2608 via node F 2606 is maintained, and the forwarding proportion of 40% of packets forwarded to node H 2608 via node G 2607 is maintained. Of course, in a real world situation, the ranges of addresses assigned next hop rules may be partitioned such that they do not align perfectly with the address ranges of sub-net routers served by the destination node.
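The 60%/40% example above amounts to splitting five equally loaded address ranges three to two. One simple way to obtain such a split is largest-remainder rounding, sketched below. This is an illustrative assumption and not necessarily how the SBSDM engine works; the function name and the range labels `net1` to `net5` are hypothetical.

```python
def assign_ranges(ranges, next_hops, proportions):
    """Map equally loaded address ranges to next hops in roughly the given proportions.

    ranges      : list of destination range labels
    next_hops   : list of next hop IDs
    proportions : forwarding proportions, same order as next_hops, summing to 1
    """
    n = len(ranges)
    quotas = [p * n for p in proportions]
    counts = [int(q) for q in quotas]              # whole ranges first
    leftover = n - sum(counts)
    # Hand out the remaining ranges to the largest fractional remainders.
    by_remainder = sorted(range(len(quotas)),
                          key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    table = {}
    remaining = iter(ranges)
    for hop, count in zip(next_hops, counts):
        for _ in range(count):
            table[next(remaining)] = hop
    return table

# Hypothetical labels for the five sub-nets served by node H.
table = assign_ranges(["net1", "net2", "net3", "net4", "net5"], ["F", "G"], [0.6, 0.4])
```

The result is an ordinary single-next-hop table: each range maps to exactly one neighbor, so packets for the same range always follow the same path.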

SBSDM provides an approximation to MNPF. Since MNPF can be made to work in networks that contain a mixture of both MNPF enabled and non-MNPF enabled routers, SBSDM works in the same fashion as MNPF in such mixed networks. This fact allows a mixture of routers to co-exist within a network: those that perform SBSDM and those that choose not to.

To understand Stream Basis Set Division Multiplexing engine algorithms, it is necessary to understand the concept of streams.

Definition of a Stream

A stream of packets destined for a particular node is associated with that destination node. A stream of packets is defined to be a sequence of packets sent from a source node that follow a unique path to the destination node associated with the stream. From this definition, it follows that for any stream, the set of paths from all nodes to said stream's destination node is a rooted tree, rooted at that destination node.

Note that traffic destined for a particular node may be composed of different streams. Each such stream, while having the same root, namely the destination node, may have different links in its rooted tree. On the other hand, it is possible for two or more different streams having the same root to have identical rooted trees. This is not a problem so long as the links of any such rooted tree are a subset of the FDAG links for said destination node (see the section entitled “THE CONCEPTS BEHIND APPROXIMATING MNPF WITH SBSDM”, below).

It is noteworthy that the entire traffic destined for a particular node is composed of streams. Streams are independent, i.e. an individual packet can be a member of one and only one stream.

Implementation of a Stream: Definition of Stream Basis Sets

The most important characteristic of a stream is that there is a unique path from a source node to its destination node. This implies that packets in a stream will arrive at the stream's destination node in the same order as they were sent from a source node. There are many applications and protocols that work most efficiently when packets arrive in order.

At any non-destination node, one can implement a stream based on the destination address or destination address ranges. In any communication network, a destination node is responsible for a set of destination addresses. Thus, there could be as many streams associated with a destination node as the number of destination addresses that it is responsible for.

For maximum utilization of network capacity, a large number of streams should be associated with the same destination node. In instances where a destination node is responsible for only a few destination addresses, the number of streams associated with a destination node can be increased by using a combination of its destination address and destination port number (if the protocol has a port) or other header bit positions to further subdivide the traffic meant for that particular destination. Note that the packet header contains the destination address or port number or header bit positions.

Thus in general terms, given a destination node t, a stream of packets S associated with t can be implemented by switching based on appropriate bit positions in the headers of those packets. Two streams S and S′ both associated with the same destination node t will differ in value in at least one of those bit positions. The entire set of packet header positions and their values that implement all streams that can be associated with destination node t will be called the full stream basis set ℬ(t) of t. Any stream S associated with destination node t has a unique stream basis b∈ℬ(t). As a consequence, any set of streams associated with destination node t corresponds to a unique stream basis set B⊂ℬ(t).

The Concepts Behind Approximating MNPF with SBSDM

The most important characteristic of a stream S is that there is a unique path from every node to its destination node t. This implies that for any stream S, the set of paths from all nodes to its associated destination node t is a rooted spanning tree T_(s), rooted at t.

On the other hand, MNPF prescribes an FDAG F_(t), which is a rooted spanning directed acyclic graph rooted at t, for every destination node t. Therefore, SBSDM methods that are required to structurally approximate F_(t) with streams can do so by maintaining that for every stream S with destination node t, the tree T_(s) is a subgraph of F_(t).

An illustration of this structural requirement is provided in FIG. 28. Example FDAG 2800 shows an FDAG F_(t) with destination node t 2802, determined by MNPF. This FDAG shows that packets can be routed from source S 2801, to destination t 2802, by following six different paths: S-B-F-t, S-B-D-E-F-t, S-B-D-E-G-t, S-C-D-E-F-t, S-C-D-E-G-t and S-C-G-t.

FIG. 29 shows a rooted tree 2900 rooted at t that structurally approximates the FDAG shown in FIG. 28 for streams associated with destination t. Rooted tree 2900 shows that the stream of packets from S to t would follow the unique path S-B-F-t. Similarly, the stream from D to t would follow the unique path D-E-F-t.

FIG. 30 shows another rooted tree 3000 rooted at t that structurally approximates the FDAG shown in FIG. 28 for streams associated with destination t. Rooted tree 3000 shows that the stream of packets from S to t would follow the path S-B-D-E-F-t, and from C to t would follow the path C-G-t.

MNPF also specifies the optimal proportions at every non-destination node x with which packets destined for node t must be forwarded along the links of FDAG F_(t). SBSDM methods are required to numerically approximate these optimal proportions using streams. There is obviously a need to establish a correspondence between the concept of proportions in MNPF and a measurable entity in SBSDM. This entity should be related to the number of packets carried by a stream over a long but finite time, i.e. the asymptotic number of packets per unit time. A way to approximate MNPF proportions is to introduce the concept of the packet rate of a stream in SBSDM.

The packet rate ρ(S) of a stream S is defined to be the number of packets carried by S per unit time. In this document we shall assume that the packet rate of a stream does not change with time. Also, since any stream S associated with destination node t is implemented by a unique stream basis b∈ℬ(t), we can define the packet rate ρ(b) of a stream basis b as the packet rate of the corresponding stream S. Finally, we define the packet rate ρ(B) of a stream basis set B⊂ℬ(t) as

$\rho(B) \equiv \sum\limits_{b \in B} \rho(b).$

SBSDM methods achieve both the structural and the numerical approximations to MNPF with streams in a network of routers. They do so by dividing or partitioning the input stream basis set arriving at every non-destination node into a set of mutually exclusive output stream basis sets, hence the name Stream Basis Set Division Multiplexing. All streams having their stream basis in one of these output stream basis sets are now sent out along exactly one of the forwarding links of said non-destination node.

There are two flavors of SBSDM methods. The simpler of the two is the Batch-Mode method. The more complicated flavor is the Iterative-Mode method.

Batch Mode SBSDM

Let the input set of streams associated with destination node t and arriving at node x≠t, have a stream basis set B⊂ℬ(t). The batch-mode SBSDM method for destination node t, executing at node x≠t, then works as follows:

It first runs the MNPF method for destination node t. MNPF specifies that x should forward packets destined for t in optimal proportions {p₁, . . . , p_(m)} to its m neighboring nodes {x₁, . . . , x_(m)}.

The method then partitions the input stream basis set B into m mutually exclusive output stream basis sets {B₁, . . . , B_(m)} such that for 1≤i≤m, all streams having their stream basis in B_(i) are sent along forwarding link x→x_(i) and the proportion p_(i) is closely approximated by the ratio ρ(B_(i))/ρ(B). See the section below entitled “APPROXIMATING MNPF PROPORTIONS” for further details on how this may be achieved.

Note that the batch-mode SBSDM method approximates MNPF both structurally and numerically, as defined above in the “THE CONCEPTS BEHIND APPROXIMATING MNPF WITH SBSDM” section. This approximation can also be considered to be a classic quantization problem, since p_(i) is being quantized. In cases where |B| is large, such an approximation is quick and easy to implement.
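The partitioning step of the batch mode can be illustrated with a simple greedy heuristic: stream bases are taken in order of decreasing packet rate and each is given to the forwarding link that is currently furthest below its target share. This heuristic is an assumption made for illustration only; the method actually used is the subject of the “APPROXIMATING MNPF PROPORTIONS” section, which is not reproduced here. The function name and the stream basis labels are hypothetical.

```python
def partition_stream_bases(rates, proportions):
    """Split stream bases into output sets whose rate shares approximate p_i.

    rates       : dict stream_basis -> packet rate rho(b)
    proportions : list of MNPF proportions p_1..p_m for the forwarding links
    Returns a list of m lists; the i-th list is the output stream basis set
    sent along the i-th forwarding link.
    """
    total = sum(rates.values())                    # packet rate of the input set
    assigned = [0.0] * len(proportions)
    output_sets = [[] for _ in proportions]
    # Largest streams first, each to the link with the biggest unmet target.
    for basis, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        i = max(range(len(proportions)),
                key=lambda k: proportions[k] * total - assigned[k])
        output_sets[i].append(basis)
        assigned[i] += rate
    return output_sets

# Hypothetical input: five stream bases of equal rate, target split 60% / 40%.
sets = partition_stream_bases({"b1": 1.0, "b2": 1.0, "b3": 1.0, "b4": 1.0, "b5": 1.0},
                              [0.6, 0.4])
```

Because every stream basis lands in exactly one output set, each stream still follows a single path, and the achieved ratios get closer to the targets as the number of stream bases grows.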

It is necessary for every non-destination node x≠t to consider the full stream basis set ℬ(t) of t as its input stream basis set B. This is because x could be receiving a proper subset of ℬ(t) from its neighbors, and could itself be generating packets destined for t. Also, such a scheme allows every node in the network the freedom to unilaterally implement a new stream for t by using a hitherto unused stream basis b∈ℬ(t). Besides, it is difficult to foresee what might happen in the future when more nodes that could send their packets to t via x are added to the network. This freedom, whereby each node can independently and unilaterally create its distribution of traffic across various streams, also implies that nodes do not need to share their implementations of streams with other nodes.

As opposed to the iterative-mode SBSDM (see the section below), this method has the advantage that any change in |B| or equivalently |ℬ(t)| does not structurally change the FDAG F_(t). The only change would be to shuffle the output streams assigned to the forwarding links.

Iterative Mode SBSDM

The iterative-mode SBSDM method is more complicated than the batch-mode SBSDM method. While keeping the spirit of the MNPF optimization engine, it modifies it to approximate the parameters at every stage of running the MNPF engine. In contrast, the batch mode approximates the parameters at the end.

The input to the method is: 1) the network represented by the directed graph G=(N,A), where N is the set of nodes and A⊂N×N is the set of links with capacities associated with each of them, and 2) the destination node t∈N.

The MNPF method works on node lists P (for permanent) and T (for temporary) of the resulting FDAG F_(t). At every stage of the method, the invariant P∪T=N is maintained. Initially P is empty and T=N. Then at each step of the method, a node u∈T having a greatest flow φ_(t)(u) is moved from list T to list P until T is empty. The entire iteration is started by setting φ_(t)(t) to ∞.

It is therefore evident that the computation of the flow φ_(t) is crucial. This is carried out in the GetFlow method of MNPF. Given a non-destination node x∈T with m neighboring nodes {x₁, . . . , x_(m)}, where x_(i)∈P and (x→x_(i))∈A for 1≦i≦m, the GetFlow method computes φ_(t)(x). It does this in essence by computing for 1≦i≦m, the maximum flow φ_(i) along forwarding link x→x_(i) and finally setting

$\phi_{t}(x) = \sum_{i=1}^{m} \varphi_{i}.$

The MNPF proportion p_(i) along link x→x_(i) is computed by φ_(i)/φ_(t)(x), so that the m proportions sum to 1.
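The permanent/temporary iteration and the role of GetFlow can be sketched as follows. This is only an illustrative sketch: the text above does not spell out how the per-link maximum flow is obtained, so the sketch assumes that the flow on link x→x_(i) is the smaller of the link capacity and the flow of x_(i), and it normalizes the per-link flows by the flow of x so that the proportions sum to 1. The names `get_flow` and `build_proportions`, the dictionary-based graph representation, and the example network are hypothetical. Link capacities are assumed to be finite.

```python
import math


def get_flow(x, capacity, phi, permanent):
    """Hypothetical GetFlow: per-link flows from x into permanent neighbors."""
    link_flow = {}
    for (u, v), cap in capacity.items():
        if u == x and v in permanent:
            # Assumption: a link carries at most its capacity and at most
            # what the downstream neighbor can itself deliver to t.
            link_flow[v] = min(cap, phi[v])
    return sum(link_flow.values()), link_flow


def build_proportions(nodes, capacity, t):
    """Return (phi, proportions) for destination t."""
    phi = {n: 0.0 for n in nodes}
    phi[t] = math.inf  # the iteration is started by setting phi_t(t) to infinity
    permanent, temporary = set(), set(nodes)
    link_flows, proportions = {}, {}
    while temporary:
        for x in temporary:
            if x != t:
                phi[x], link_flows[x] = get_flow(x, capacity, phi, permanent)
        # Move the temporary node with the greatest flow to the permanent list.
        u = max(sorted(temporary), key=lambda n: phi[n])
        temporary.remove(u)
        permanent.add(u)
        if u != t and phi[u] > 0:
            proportions[u] = {v: f / phi[u] for v, f in link_flows[u].items()}
    return phi, proportions


if __name__ == "__main__":
    # Hypothetical three-node network: a->t (3), b->t (2), b->a (4).
    caps = {("a", "t"): 3, ("b", "t"): 2, ("b", "a"): 4}
    flows, props = build_proportions(["a", "b", "t"], caps, "t")
    print(flows)
    print(props)
```

Because a node only forwards to neighbors that were already permanent when it was moved, the resulting forwarding graph in this sketch cannot contain a cycle.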

The iterative-mode SBSDM recognizes the fact that these MNPF proportions can only be approximated by streams using the methods described in the “APPROXIMATING MNPF PROPORTIONS” section that follows. Thus, instead of using the proportion p_(i) along link x→x_(i) for 1≦i≦m as computed by the GetFlow method, it modifies p_(i) to the approximation p_(i)′ given by methods described in the following section. It is clear that the resulting FDAG F_(t)′ arising from repeated application of these approximations at every step of the above method may not equal F_(t), the FDAG obtained by the “pure” MNPF method.

Note that the iterative-mode SBSDM method approximates MNPF both structurally and numerically, as defined in the section above entitled “THE CONCEPTS BEHIND APPROXIMATING MNPF WITH SBSDM”. As opposed to the batch-mode SBSDM method, even the structure of MNPF is approximated such that an approximate FDAG F_(t)′ is obtained.

The iterative-mode SBSDM method may perform better than the batch-mode method because it is more tuned to the application. On the other hand, the iterative-mode method has the disadvantage that any change in B(t) structurally changes the resulting approximate FDAG F_(t)′. It can therefore be used only when B(t) is reasonably stable.

As in the batch-mode SBSDM method, each node in the iterative-mode SBSDM method can independently and unilaterally create its distribution of traffic across various streams. Thus, nodes in the iterative-mode SBSDM method also do not need to share their implementations of streams with other nodes.

Approximating MNPF Proportions

Let B⊂B(t) be the input stream basis set for destination node t, arriving at node x≠t. Proportions {p₁, . . . , p_(m)} on the m forwarding links are specified a priori. The problem is to find a partition of B into m mutually exclusive output stream basis sets {B₁, . . . , B_(m)} such that for 1≦i≦m, the ratio ρ(B_(i))/ρ(B) closely approximates p_(i).

Let B={b₁, . . . , b_(s)} where the b's represent the individual stream bases for t and let w_(i)≡ρ(b_(i))/ρ(B) be defined as the normalized packet rate of stream basis b_(i) for 1≦i≦s. Then this problem reduces to the following partitioning problem:

Given a set of s positive real numbers ω≡{w₁, . . . , w_(s)} with

$\sum_{i=1}^{s} w_{i} = 1,$

and m positive real numbers {p₁, . . . , p_(m)} with

$\sum_{i=1}^{m} p_{i} = 1,$

determine a partition of ω into m (mutually exclusive) sets {ω₁, . . . , ω_(m)} such that for 1≦i≦m, the sum r_(i) of the w's in set ω_(i) is

$r_{i} \equiv \sum_{w \in \omega_{i}} w \approx p_{i}.$

One way to formulate this approximation problem is to think in terms of discrete probability distributions. In that case, one seeks the partition such that the K-L divergence of the probability distribution of the partitions {r₁, . . . , r_(m)} from the given probability distribution {p₁, . . . , p_(m)} is minimized. That is,

$\sum_{i=1}^{m} p_{i} \log \frac{p_{i}}{r_{i}}$

is minimized. Numerous other formulations are also possible.
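As a small illustration of this objective, the sketch below evaluates the K-L divergence of a realized distribution {r₁, . . . , r_(m)} from the target {p₁, . . . , p_(m)}. The function name `kl_divergence` is hypothetical, and the handling of an empty output set (r_(i)=0) as an infinite penalty follows the usual convention for this divergence rather than anything stated in the text.

```python
import math


def kl_divergence(p, r):
    """Return sum_i p_i * log(p_i / r_i) for two discrete distributions.

    A target entry p_i > 0 paired with r_i == 0 gives an infinite
    divergence, i.e. a partition that starves a link is never preferred.
    """
    total = 0.0
    for p_i, r_i in zip(p, r):
        if p_i == 0:
            continue  # a zero-probability term contributes nothing
        if r_i == 0:
            return math.inf
        total += p_i * math.log(p_i / r_i)
    return total


if __name__ == "__main__":
    target = [0.6, 0.4]
    print(kl_divergence(target, [0.6, 0.4]))  # exact match
    print(kl_divergence(target, [0.5, 0.5]))  # coarser approximation
```

A candidate partition with a smaller value is a closer approximation of the target proportions under this formulation.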

Unfortunately, this is an NP-Hard problem in general. For the case of m=2 and p₁=p₂=0.5, the decision theoretic problem of deciding whether ω can be partitioned into 2 parts such that r₁=r₂=0.5 is NP-Complete. However, approximate and pseudo polynomial time methods are available for our use.
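One simple approximate method is a greedy heuristic, sketched below. It is offered only as an example of the kind of approximation that could be used; it is not claimed to be the method intended here, and it carries no optimality guarantee. The name `greedy_partition` and the "largest weight first, largest remaining deficit" rule are choices made for this sketch.

```python
def greedy_partition(weights, proportions):
    """Greedily split normalized packet rates among m forwarding links.

    weights: normalized packet rates w_1..w_s (summing to 1).
    proportions: target proportions p_1..p_m (summing to 1).
    Returns (sets, sums): the stream indices per link and the realized r_i.
    """
    m = len(proportions)
    sets = [[] for _ in range(m)]
    sums = [0.0] * m
    # Place heavy streams first so that light ones can fine-tune the totals.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for idx in order:
        # Pick the link whose realized share is furthest below its target.
        link = max(range(m), key=lambda i: proportions[i] - sums[i])
        sets[link].append(idx)
        sums[link] += weights[idx]
    return sets, sums


if __name__ == "__main__":
    w = [0.3, 0.3, 0.2, 0.1, 0.1]
    parts, realized = greedy_partition(w, [0.6, 0.4])
    print(parts)
    print(realized)
```

The heuristic runs in O(s log s + s·m) time, which makes it usable even when the number of stream bases is large, at the cost of possibly missing the best partition.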

A Special Case: Maximum Likelihood Approach

If the packet rates of the stream bases are unknown, one can distribute the s stream bases b₁, . . . , b_(s) randomly among the m forwarding links with proportions p₁, . . . , p_(m). Then the probability Pr(s₁, . . . , s_(m)) that s_(i) stream bases are forwarded along the i^(th) link for 1≦i≦m is given by the multinomial distribution:

$\Pr(s_{1},\ldots,s_{m}) = \binom{s}{s_{1},\ldots,s_{m}}\, p_{1}^{s_{1}} \times \cdots \times p_{m}^{s_{m}}.$

One way to obtain the best set of values for s₁, . . . , s_(m) is to find those values that maximize the probability Pr(s₁, . . . , s_(m)). Known approaches provide an O(m²) time solution to this problem.
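For small s and m, the maximizing counts can be found by direct enumeration, as in the sketch below. This brute-force search is exponential in m and is shown only to make the objective concrete; it is not the O(m²) approach referred to above. The function names are hypothetical.

```python
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial_pmf(counts, proportions):
    """Probability that link i receives counts[i] of the stream bases."""
    coeff = factorial(sum(counts))
    prob = 1.0
    for s_i, p_i in zip(counts, proportions):
        coeff //= factorial(s_i)  # stays an exact integer at every step
        prob *= p_i ** s_i
    return coeff * prob


def most_likely_counts(num_streams, proportions):
    """Exhaustively search for the counts (s_1..s_m) with maximum probability."""
    return max(
        compositions(num_streams, len(proportions)),
        key=lambda counts: multinomial_pmf(counts, proportions),
    )


if __name__ == "__main__":
    # Five stream bases, two forwarding links with proportions 0.6 and 0.4.
    print(most_likely_counts(5, [0.6, 0.4]))
```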

FIG. 31 is a simplified view of node E 2605 of FIG. 26 illustrating one embodiment. Node E 2605 includes a processor 3101 and a memory 3102. Processor 3101 and memory 3102 communicate via buses 3103 and 3104. Memory 3102 stores network topology information 904 according to a well-known protocol such as OSPF or ISIS. An FDAG set-up engine 905 running on the processor 3101 uses the network topology information 904 for computing optimal FDAGs for each possible destination node of the network. The FDAG set-up engine 905 uses the computed FDAGs to determine the appropriate forwarding proportions for each destination node. Note that in this case, the FDAG set-up engine 905 runs as a sub-routine of the SBSDM engine 3105.

The SBSDM engine 3105 constructs an SBSDM routing table 2700 that approximates the MNPF forwarding proportions. The SBSDM routing table 2700 for the node 2605 is stored in memory 3102 for later use in forwarding of an information packet to its destination. Thus, approximately 60% of packets forwarded from node E 2605 destined for node H 2608 are forwarded via link 2614 to node F 2606, while approximately 40% of packets forwarded from node E 2605 destined for node H 2608 are forwarded via link 2615 to node G 2607.

Other MNPF capable nodes in the network each run their own instance of the FDAG set-up engine 905 to determine their own MNPF routing tables. These MNPF capable nodes can run other variants of MNPF, such as the Alias method and its accompanying RN routing table, described above. Note that SBSDM methods can be run on a subset of SBSDM capable routers, in which case these routers determine routing tables for other SBSDM capable routers.

Although Multi-Neighbor Proportional Forwarding is described above in connection with an embodiment in which a network router determines optimal paths and proportional forwarding parameters, in another embodiment computing optimal FDAGs for destination nodes and determining MNPF routing table entries for network nodes is performed by a computing device other than one of the network routers, such as a laptop computer. The computing device can be connected to the network and assign, via its connection to the network, the appropriate MNPF routing table to each node that implements the MNPF routing method. Also, note that the routing tables can be transferred from the computing device other than by network connection; for example, an MNPF routing table could be transferred to the appropriate node physically, using a storage device such as a disk or Flash drive. More than one computing device could be used to compute some routing tables. Additionally, although maximizing the flow rate of packets to determine optimal paths and proportional forwarding parameters is described, variant methods may be employed to determine MNPF parameters for forwarding packets along multiple acyclic paths to destination nodes. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the claims.

What is claimed is:
 1. A method, comprising: (a) computing a Forwarding Directed Acyclic Graph (FDAG) rooted at a destination node in a network of nodes, wherein the FDAG defines a plurality of routes for forwarding an information packet from any node in the network to the destination node, wherein the information packet includes destination data, wherein the defined plurality of routes includes forwarding proportions for forwarding an information packet from at least one network node to one of a plurality of adjacent network nodes; and (b) constructing a Multi-Neighbor Proportional Forwarding (MNPF) routing table for a first node in the network of nodes, wherein the first node forwards the information packet, wherein the first node has a plurality of allowed adjacent network nodes to which the information packet can be forwarded, wherein the MNPF routing table includes forwarding proportions that determine selection of one of the plurality of allowed adjacent network nodes without reference to the source of the information packet, wherein at least some of the forwarding proportions in the MNPF routing table are derived at least in part from said computed FDAG, wherein the network of nodes is a mixed network, wherein at least one node forwards packets according to an MNPF routing table, wherein at least one node forwards packets according to another type of routing table, wherein an FDAG for each destination node in the network is computed by at least one node of the network, and wherein network topology information used in computing the FDAG for each destination node includes information as to the type of routing table used by each node for forwarding packets.
 2. The method of claim 1, further comprising: forwarding an information packet from a node in the network of nodes to the destination node along a loop-free path, wherein the information packet is forwarded at least once according to a forwarding proportion stored in an MNPF routing table.
 3. The method of claim 1, wherein each of some nodes in the network of nodes stores an MNPF routing table determined based at least in part on an FDAG.
 4. The method of claim 1, wherein each network node has a forwarding link to each adjacent network node, wherein each forwarding link has a capacity, and wherein a determining of forwarding proportions in the FDAG in (a) is based at least in part on the capacities of the forwarding links to adjacent network nodes.
 5. The method of claim 1, wherein each network node has a forwarding link to each adjacent network node, wherein each forwarding link has a link metric based on a link delay or a link transmission time, and wherein a determining of forwarding proportions in the FDAG in (a) is based at least in part on link metrics of the forwarding links to adjacent network nodes.
 6. A method, comprising: (a) computing a Forwarding Directed Acyclic Graph (FDAG) rooted at a destination node in a network of nodes, wherein the FDAG defines a plurality of routes for forwarding an information packet from at least one node in the network to the destination node without reference to the source of the information packet, and wherein the information packet includes destination data; and (b) constructing a Multi-Neighbor Proportional Forwarding (MNPF) routing table for a first node in the network of nodes, wherein the first node forwards the information packet, wherein the first node has a plurality of allowed adjacent network nodes to which the information packet can be forwarded, wherein the MNPF routing table includes forwarding proportions that determine selection of one of the plurality of allowed adjacent network nodes without reference to the source of the information packet, wherein at least some of the forwarding proportions in the MNPF routing table are derived at least in part from said computed FDAG, wherein the network of nodes is a mixed network, wherein at least one node forwards packets according to an MNPF routing table, wherein at least one node forwards packets according to another type of routing table, wherein an FDAG for each destination node in the network is computed by at least one node of the network, and wherein the network topology information used in computing the FDAG for each destination node includes information as to the type of routing table used by each node for forwarding packets.
 7. The method of claim 6, further comprising: forwarding an information packet from a node in the network of nodes to the destination node along a loop-free path, wherein the information packet is forwarded at least once according to proportional selection of a next adjacent node as determined by an MNPF routing table stored in a forwarding node.
 8. The method of claim 6, wherein each network node has a forwarding link to each adjacent network node, wherein each forwarding link has a capacity, and wherein the computing of the FDAG in (a) is based at least in part on the capacities of the forwarding links to adjacent network nodes.
 9. The method of claim 6, wherein each network node has a forwarding link to each adjacent network node, wherein each forwarding link has a link metric based on a link delay or a link transmission time, and wherein the computing of the FDAG in (a) is based at least in part on the link metrics of the forwarding links to adjacent network nodes.
 10. The method of claim 6, wherein the FDAG of (a) is computed by a device other than a node of the network.
 11. A method involving a mixed network, wherein the mixed network includes at least one node that forwards information packets destined to a destination node to a selected one of a plurality of adjacent nodes, wherein the mixed network includes at least one node that forwards information packets destined to a destination node to only one adjacent node, the method comprising: using a plurality of forwarding proportions in constructing a Multi-Neighbor Proportional Forwarding (MNPF) routing table for a first node of the mixed network, wherein the first node has a plurality of allowed adjacent network nodes to which it can forward an information packet destined for a destination node, wherein the MNPF routing table includes forwarding proportions that determine selection of one of the plurality of allowed adjacent network nodes without reference to the source of the information packet, wherein the forwarding proportions used in constructing the MNPF routing table were determined by computing a Forwarding Directed Acyclic Graph (FDAG), and wherein the constructing of the MNPF routing table is performed by a network device.
 12. The method of claim 11, further comprising: performing said computing of the FDAG on the first node.
 13. The method of claim 11, further comprising: using the MNPF routing table on the first node to route information packets out of the first node.
 14. A method involving a mixed network, wherein the mixed network includes at least one node that forwards information packets destined to a destination node to a selected one of a plurality of adjacent nodes, wherein the mixed network includes at least one node that forwards information packets destined to a destination node to only one adjacent node, the method comprising: routing packets through a first node using a Multi-Neighbor Proportional Forwarding (MNPF) routing table, wherein the first node is a node of the mixed network and has a plurality of allowed adjacent network nodes to which it can forward an information packet destined for a destination node, wherein the MNPF routing table includes forwarding proportions that determine selection of one of the plurality of allowed adjacent network nodes without reference to the source of the information packet, and wherein the MNPF routing table was constructed based at least in part on forwarding proportions determined by computing a Forwarding Directed Acyclic Graph (FDAG).
 15. The method of claim 14, further comprising: performing said construction of the MNPF on the first node.
 16. The method of claim 14, further comprising: performing said computing of the FDAG on the first node.
 17. A method involving a mixed network, wherein the mixed network includes at least one node that forwards information packets destined to a destination node to a selected one of a plurality of adjacent nodes, wherein the mixed network includes at least one node that does not forward information packets destined to a destination node to a selected one of a plurality of adjacent nodes, the method comprising: a step for using a plurality of forwarding proportions in constructing a Multi-Neighbor Proportional Forwarding (MNPF) routing table for a first node of the mixed network, wherein the first node has a plurality of allowed adjacent network nodes to which it can forward an information packet destined for a destination node, wherein the MNPF routing table includes forwarding proportions that determine selection of one of the plurality of allowed adjacent network nodes without reference to the source of the information packet being routed, wherein the forwarding proportions used in constructing the MNPF routing table were determined as part of a computing of a Forwarding Directed Acyclic Graph (FDAG), and wherein the constructing of the MNPF routing table is performed by a network device.
 18. The method of claim 17, wherein the FDAG is rooted at a destination node in the mixed network, wherein the FDAG defines one or more routes for forwarding an information packet from any node in the mixed network to the destination node, wherein there is at least one node in the mixed network for which the FDAG defines a plurality of routes for forwarding an information packet destined for the destination node out of said at least one node, and wherein the FDAG includes a forwarding proportion associated with each of the defined routes.